Biology helps to construct weighted scale free networks 
Abstract

In this work we study a simple evolutionary model of bipartite networks whose evolution is
based on the duplication of nodes. Using analytical results along with numerical simulation of the
model, we show that this evolutionary model results in weighted scale free networks. Indeed
we find that in the one mode picture we have weighted networks with scale free distributions for
interesting quantities like the weights, the degrees and the weighted degrees of the nodes and the
weights of the edges.

I. INTRODUCTION

Most interacting systems can be regarded as complex networks [1, 2, 3]. Certainly,
seeking the structural and universal properties of these networks is a main goal in studying
the behavior of these systems [4, 5, 6, 7, 8]. Among these properties are the small-world
phenomenon [4] and the scale free behavior of the degree distribution [1], where the degree denotes
the number of nearest neighbors of a node. Clearly, finding the basic ingredients that produce
such behaviors helps us to better understand real networks. For example, it is now
clear that a simple evolution of networks in which new nodes prefer to be connected to higher
degree nodes can give rise to scale free networks with a power law degree distribution,
$P(k) \sim k^{-\gamma}$ [5]. This process seems a natural rule in the evolution of most
real networks, and one can even measure the tendency of new nodes to attach
preferentially [9].



An interesting feature of real networks is that they are complex weighted networks [10, 11].
For example, we can associate a weight to each node of a network, which might represent
the size of that node or its power to create connections with the other nodes. In a protein
complex network this weight is the number of proteins attributed to a protein complex
[12, 13]. We could also assign a weight to each edge of a network, which might be a measure
of the interaction between the end point nodes of that edge. In the example
of the protein complex network this weight is the number of proteins that two protein
complexes have in common. The weight of an edge in this case would be an appropriate
measure to quantify the functional correlation of the two protein complexes connected by that
edge. In the same way one could consider social collaboration networks, e.g. scientific
coauthorship networks [14, 15, 16], as weighted networks. In this situation the weight of a
node, which represents an author, gives the number of articles written by that author, and
the weight of an edge between two nodes is the number of articles that the corresponding
authors have coauthored. It is reasonable to think that two authors with a larger
number of articles in common communicate more and so are closer to
each other than to the other authors. As another example one may take the social network
of communities or groups in a society. Each group has a weight which represents the number
of its members, and the weight of an edge connecting two groups gives the number of shared
members. Certainly two groups with a larger number of members in common interact more
strongly with each other and so have a higher probability of transmitting any kind of
information between one another.

As the above paragraph reveals, there are a large number of weighted networks which can
be exhibited as a one mode picture of a bipartite network [17]. For example, in the case
of the protein complex network we could make a bipartite network of proteins and protein
complexes. An edge in this bipartite network only connects a protein (a node of type I)
to a protein complex (a node of type II) and means that this protein is a member of the
associated protein complex. In the same way one can construct the bipartite network of a
scientific coauthorship network, where nodes of type I and II represent the articles and the
authors respectively.

The presence of power law distributions with exponents around 2 is a main characteristic of
the networks studied above [10, 13, 14, 16, 17]. Here the relevant distributions are the weight,
the degree and the weighted degree distributions of the nodes and the weight distribution of
the edges in the one mode picture. We define the weighted degree of a node as the sum of
the weights of the edges emanating from that node. Thus one can ask whether there is a simple
rule for the evolution of bipartite networks which reproduces the basic features of the above
complex weighted networks.

In this paper we study a simple model for the evolution of bipartite networks. To this end
we apply a well known rule of biology in the context of protein evolution, that is duplication
of proteins, to the evolution of bipartite networks. It has been shown that this mechanism
can reproduce well the structural properties of protein interaction networks [18, 19].
In Ref. [13] the same procedure has been applied to simulate the evolution of a protein
complex network. Let us illustrate the duplication mechanism with the example of a scientific
coauthorship network. A new article in this network can be regarded as a copy of an
old article (it has been duplicated) with some changes, probably in the list of the authors (its
connections have undergone mutation). This new article may also introduce a new author
to the list of present authors (it creates a new node of the other type). Note that an author
with a higher number of articles has a larger probability of producing a new article. This
feature automatically enters the model if we randomly select an article for duplication. This
is an important property of the duplication mechanism in bipartite networks, and it results
in the emergence of scale free distributions.
In the following we will permit both types of nodes to have the chance of duplication.
Note that duplicating a node of the second type is meaningless in the example of the scientific
coauthorship network, in which the authors of an article are fixed at the time of its birth.
However, in other examples such as the social network of groups, it is a reasonable event: a new group
might form as a duplication of an already present group. Here, for the sake of simplicity,
we only consider the simple case of pure duplication of the nodes. Finally we shall take
into account the limited age of the nodes, which prevents them from having connections
with new nodes. This phenomenon has an essential role in networks like scientific
collaboration networks, where a retired person cannot contribute to writing a new article.
It turns out that the simple model introduced in this paper can generate complex weighted
networks with scale free distributions for the weights of both the nodes and the edges, and
also for the degree and the weighted degree of the nodes in the one mode pictures.

The paper is organized as follows. In the next section we give the model definition in
detail. Section III is devoted to the analytic study of the model along with the results of
the numerical simulations. In Section IV we study the effect of the limited ages of the nodes
on the behavior of the interesting quantities by means of numerical simulations. Section V
contains the concluding remarks of the paper.

II. THE MODEL DEFINITION 

Consider a bipartite network with $n$ nodes of the first type and $N$ nodes of the second
type. We index nodes of type I by small letters $a, b, c, \ldots$ and nodes of the other
type by capital letters $A, B, C, \ldots$. In the same way, each quantity is represented
by a small or capital letter according to the type of node it refers to. For example, by
$m_a(t)$ and $m_A(t)$ we mean respectively the weight of a first and a second type node at time
$t$. Here the weight of a node is the number of its connections in the bipartite network. To
evolve the network we go through the following rules:

i) In each step we choose one type of node for duplication. With probability $\lambda$ a node of the
first type, and with probability $1 - \lambda$ a node of the second type, will be chosen for duplication.
ii) Suppose that we have decided to duplicate a node of the first type. A node of this type is
chosen at random to produce a copy of itself. This copy, which is of the same type as the
duplicated node, has the same weight and the same set of connections with
the nodes of the other type.
iii) Finally we allow the new node to create a node of the other type and to connect to it.

These processes are shown in Fig. 1.

FIG. 1: A step of the evolution of a bipartite network in which node 3 of type I has been duplicated.
The result of this duplication is node 5.

Note that all the above processes occur in one time step, that is from $t$ to $t + 1$, and the
same events could happen for a node of the second type.
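Rules i) to iii) can be sketched as a short simulation. The following is a minimal illustration, not the code used for the figures of this paper: the function name `evolve`, the adjacency-set representation and the `seed` argument are assumptions made here, and `lam` stands for the probability of duplicating a node of the first type. It starts, as in the next section, from one node of type I connected to one node of type II.

```python
import random


def evolve(t_max, lam, seed=None):
    """Evolve the bipartite duplication model up to time t_max.

    Returns (links_I, links_II): links_I[a] is the set of type II
    neighbours of type I node a, and links_II[A] is the set of type I
    neighbours of type II node A. Nodes are labelled 0, 1, 2, ...
    """
    rng = random.Random(seed)
    # Initial condition at t = 1: one type I node linked to one type II node.
    links_I = [{0}]
    links_II = [{0}]
    for _ in range(1, t_max):
        # Rule i: pick the type to duplicate (type I with probability lam).
        if rng.random() < lam:
            same, other = links_I, links_II
        else:
            same, other = links_II, links_I
        # Rule ii: a uniformly chosen node produces a copy of itself,
        # which inherits the full set of connections of its parent.
        parent = rng.randrange(len(same))
        copy = len(same)
        same.append(set(same[parent]))
        for neighbour in same[copy]:
            other[neighbour].add(copy)
        # Rule iii: the copy creates one node of the other type and links to it.
        fresh = len(other)
        other.append({copy})
        same[copy].add(fresh)
    return links_I, links_II
```

The weight of a node is then simply the size of its neighbour set, e.g. `len(links_II[A])` for a node `A` of type II. Choosing the parent uniformly among nodes of one type is what makes a node of the other type gain connections at a rate proportional to its current weight.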

III. ANALYTIC STUDY OF THE MODEL 



Note that the only parameter of the model defined above is $\lambda$. Moreover, due to the
symmetry of the evolution, if we compute the behavior of nodes of type II we obtain the
behavior of the other type simply by replacing $\lambda$ with $1 - \lambda$.

As the initial condition we start at time t = 1 with one node of type I which has been 
connected to one node of type II. Thus according to the deterministic creation of the nodes, 
the number of nodes at time t will be given by n(t) = N(t) = t. 

We can write the following equation for the average weight of node $A$, which entered the
network at time $t_A < t$:

$$ m_A(t+1) = m_A(t) + \lambda m_A(t)/n(t). \qquad (1) $$

Indeed the second term in this equation is the probability that a node of type I which is
connected to node $A$ is selected for duplication. If so, the weight of node $A$ will increase by
one due to the connection with the new node. Note that a node of type II has the average
weight

$$ m_A(t_A) = 1 + (1-\lambda) \sum_{B=1}^{N(t_A-1)} m_B(t_A-1)/N(t_A-1) \qquad (2) $$

at the time of its birth. From these equations one can find the following equation for
$\Omega(t)$, the average number of edges in the bipartite network:

$$ \Omega(t+1) = \Omega(t) + 1 + \Omega(t)/t. \qquad (3) $$

Here we have used the fact that $n(t) = N(t) = t$. In the continuum approximation, where we take $t$ as a
continuous variable, the above equation can be rewritten as

$$ d\Omega(t)/dt = 1 + \Omega(t)/t, \qquad (4) $$

which has the solution

$$ \Omega(t) = t(1 + \ln t), \qquad (5) $$

with the initial condition $\Omega(1) = 1$. Thus the average weight of a node in the network will
increase with the logarithm of $t$. In the same way the solution of Eq. (1) takes the following
form in the continuum approximation:

$$ m_A(t) = m_A(t_A)(t/t_A)^{\lambda} = [1 + (1-\lambda)(1 + \ln t_A)](t/t_A)^{\lambda}. \qquad (6) $$

For $\lambda = 1$ this equation gives

$$ m_A(t) = t/t_A. \qquad (7) $$
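To get a feeling for the quality of the continuum approximation, one can iterate the discrete recursions, Eqs. (1) and (3), and compare them with Eqs. (5) and (6). The sketch below does this with plain Python; the function names are chosen here for illustration, and `lam` again denotes the probability of first type duplication.

```python
import math


def omega_exact(t_max):
    """Iterate Eq. (3), Omega(t+1) = Omega(t) + 1 + Omega(t)/t, from Omega(1) = 1."""
    omega = 1.0
    for t in range(1, t_max):
        omega += 1.0 + omega / t
    return omega


def omega_continuum(t):
    """Continuum solution, Eq. (5)."""
    return t * (1.0 + math.log(t))


def weight_growth(t_A, t, lam):
    """Ratio m_A(t)/m_A(t_A) obtained by iterating Eq. (1) with n(t) = t."""
    ratio = 1.0
    for s in range(t_A, t):
        ratio *= 1.0 + lam / s
    return ratio
```

Dividing Eq. (3) by $t+1$ shows that the exact recursion gives $\Omega(t) = t H_t$, where $H_t$ is the $t$-th harmonic number, so $\Omega(t)/t$ and $1 + \ln t$ share the same logarithmic growth and differ only by a constant of order one. For $\lambda = 1$ the product in `weight_growth` telescopes to $t/t_A$, in agreement with Eq. (7).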

Note that, having this behavior in hand, we can use the conservation of probabilities to
compute the distribution function of the weight of the nodes, $S(m)$, in the network. Indeed
the number of nodes whose weights are between $m$ and $m + \Delta m$, i.e. $S(m)\Delta m$, is equal to
the number of nodes which entered the network between times $t_A$ and $t_A + \Delta t_A$, where
$t_A$ is given by Eq. (7). Note also that in each step one new node is introduced to the
network. Therefore we find that $S(m)$ behaves like

$$ S(m) \sim m^{-2}. \qquad (8) $$
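The conservation argument for $\lambda = 1$ can be written out explicitly. The following is a short sketch that treats $m$ and $t_A$ as continuous variables and uses only the growth law $m_A(t) = t/t_A$ together with the fact that one node of the second type is born per time step:

```latex
% Conservation of probability for \lambda = 1 (continuum sketch).
% One type II node is born per step, so the number of nodes born in
% [t_A, t_A + \Delta t_A] is \Delta t_A.
\begin{align*}
  m = \frac{t}{t_A} \;&\Longrightarrow\; t_A(m) = \frac{t}{m}, \\
  S(m)\,\Delta m = |\Delta t_A| \;&\Longrightarrow\;
  S(m) = \left|\frac{d t_A}{d m}\right| = \frac{t}{m^{2}} \sim m^{-2}.
\end{align*}
```

The prefactor $t$ only reflects the size of the network: integrating $t/m^2$ over $1 \le m \le t$ gives $t - 1$, which is the number of nodes up to a correction of order one.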

On the other hand, in the case of $\lambda = 0$ we have the following relation for $m_A(t)$:

$$ m_A(t) = 2 + \ln t_A, \qquad (9) $$

which is independent of $t$ and only depends on the birth time of the node $A$. It is easy to
show that in this case

$$ S(m) \sim e^{m}. \qquad (10) $$

FIG. 2: Weight distribution of the nodes of type II for some values of $\lambda$. The parameters are
$t = 1000$, $\lambda = 1$ (squares), $\lambda = 0.5$ (circles) and $\lambda = 0$ (triangles). The data are the result of averaging
over 50 runs of the evolution of the model. This number is the same in all the numerical simulations
of the model presented in this paper. The guideline shows a power law behavior of exponent 2.

Thus the number of nodes increases exponentially with the weight of the node. But the network
is finite, and in this case we will encounter finite size effects for the smaller values of $m$.
Therefore for $\lambda = 1$ we have the exponent 2 for $S(m)$, while in the case of $\lambda = 0$ this exponent
will be $\infty$, which reflects the exponential decay of the weight distribution due to the finite
size of the network. These arguments are confirmed in Fig. 2, which shows the results
of numerical simulations in these cases. Note that when $\lambda = 1$ we always have a node of the
second type with weight $t$; by Eq. (7) with $t_A = 1$, this is the node we started with. We recall
that at the same time the exponent of $s(m)$, the weight distribution of nodes of type I, is
obtained by interchanging $\lambda$ and $1 - \lambda$.

Now let us consider the one mode picture of the above bipartite network constructed by the
second type nodes. First we focus on the evolution of the degree of a node in this picture.
In the case of first type duplication, the number of neighbors of node $A$ increases by one when
the duplicated node is a member of $A$. The probability for this to happen is
$\lambda m_A(t)/n(t)$. In the case of second type duplication, the number of neighbors increases only
when the duplicated node is a neighbor of node $A$ or the node $A$ itself. This probability is
given by $(1-\lambda)(k_A(t)+1)/N(t)$. So for node $A$ with $t_A < t$, we have

$$ k_A(t+1) = k_A(t) + \lambda m_A(t)/n(t) + (1-\lambda)(k_A(t)+1)/N(t). \qquad (11) $$

In the same way one obtains the average degree of node $A$ at the time of its birth:

$$ k_A(t_A) = \lambda \sum_{b=1}^{n(t_A-1)} m_b(t_A-1)/n(t_A-1) + (1-\lambda) \sum_{B=1}^{N(t_A-1)} k_B(t_A-1)/N(t_A-1). \qquad (12) $$

But $\sum_{b=1}^{n(t)} m_b(t) = \Omega(t)$, and $\Omega(t)$ is given by Eq. (5). We also define
$L(t) := \sum_{B=1}^{N(t)} k_B(t)$ and use Eqs. (11) and (12) to write the following relation for $L(t)$:

$$ L(t+1) = L(t) + 2\lambda \Omega(t)/n(t) + 2(1-\lambda) L(t)/N(t) + (1-\lambda), \qquad (13) $$

where $L(t)/2$ gives the number of edges in the one mode picture of the nodes of type II.
Solving this equation in the continuum approximation we find

$$ L(t) = (1+\lambda-2\lambda^2)(t^{2(1-\lambda)} - t)/(1-2\lambda)^2 - 2\lambda t \ln t/(1-2\lambda). \qquad (14) $$

For $\lambda = 1$ we get $L(t) = 2t\ln t$ and for $\lambda = 0$ this behavior is replaced by $L(t) = t(t-1)$.
Going back to Eq. (11), we are now ready to solve it in the continuum approximation:

$$ k_A(t) = \lambda [1 + (1-\lambda)(1 + \ln t_A)](t/t_A)^{\lambda}/(2\lambda-1) - 1 + C_A t^{1-\lambda}, \qquad (15) $$

where $C_A$ is a constant determined by Eq. (12). For $\lambda = 1$ the above relation takes the form

$$ k_A(t) = t/t_A + \ln(t_A - 1), \qquad (16) $$

which for $t \to \infty$ predicts a power law degree distribution of exponent 2 for large values
of $k$, that is $P(k) \sim k^{-2}$. In Fig. 3 we have shown the degree distribution for some values of
$\lambda$. Note that for $\lambda = 0$ we have a fully connected network in which $k_A(t) = t - 1$ for all the
nodes.
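The one mode picture itself is easy to build from the bipartite membership lists. The sketch below is an illustration under assumptions made here (the function name `one_mode` and the list-of-sets input format are not from the paper): two nodes of type II are joined by an edge whose weight is the number of type I nodes they share, the degree counts such edges, and the weighted degree is the sum of their weights.

```python
from collections import Counter
from itertools import combinations


def one_mode(links_II):
    """Project a bipartite network onto its type II nodes.

    links_II[A] is the set of type I nodes attached to type II node A.
    Returns (w, k, Z): edge weights w[(A, B)] for A < B, degrees k[A]
    and weighted degrees Z[A] in the one mode picture.
    """
    w = {}
    for A, B in combinations(range(len(links_II)), 2):
        shared = len(links_II[A] & links_II[B])
        if shared:
            w[(A, B)] = shared  # number of common type I neighbours
    k = Counter()
    Z = Counter()
    for (A, B), weight in w.items():
        k[A] += 1
        k[B] += 1
        Z[A] += weight
        Z[B] += weight
    return w, k, Z


# Small hand-made example with four type II nodes and five type I nodes.
example = [{0, 1, 2}, {1, 2}, {2, 3}, {4}]
w, k, Z = one_mode(example)
```

In this example the weighted degree of each node equals the sum over its members of the member's weight minus one, which is the counting behind the relation between the weighted degree and the weights of the type I members used later in this section. The loop over all pairs is quadratic in the number of type II nodes, which is adequate for networks of the size considered here.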

Note also that each edge of the above network has a weight $w$, and so one can speak of the
weighted degree of a node, $Z_A(t)$, which gives the sum of the weights of the edges emanating from
that node, i.e. $Z_A(t) = \sum_{B \neq A} w_{AB}(t)$. The average of $Z_A(t)$ for $t_A < t$ is determined by
the following considerations. First consider the case of type I duplication, where $m_A(t)/n(t)$
gives the probability that a member of node $A$ is selected for duplication. In this case $Z_A(t)$
increases by $m(t|A)$, which we define as the average weight of a node of type I at time $t$
which is connected to node $A$ in the bipartite network. On the other hand, in the case of
second type duplication, $Z_A(t)$ increases only when the selected node is the node $A$ or one of
its neighbors in the one mode picture. In the latter case $Z_A(t)$ increases by $w(t|A)$, which
denotes the average weight of an edge emanating from node $A$ in the one mode picture, and
in the former case it increases by $m_A(t)$. Thus we obtain

$$ Z_A(t+1) = Z_A(t) + \lambda m_A(t) m(t|A)/n(t) + (1-\lambda)[m_A(t)/N(t) + k_A(t) w(t|A)/N(t)]. \qquad (17) $$

FIG. 3: Degree distribution of the nodes of type II in the one mode picture. The parameters are
$t = 1000$, $\lambda = 1$ (squares), $\lambda = 0.5$ (circles) and $\lambda = 0$ (triangles). The guideline shows a power
law behavior of exponent 2.

Similarly, when node $A$ enters the network we have

$$ Z_A(t_A) = \lambda \sum_a m_a(t_A-1)/n(t_A-1) + (1-\lambda) \sum_B [m_B(t_A-1) + k_B(t_A-1) w(t_A-1|B)]/N(t_A-1). \qquad (18) $$

Let us define $Q(t) := \sum_A Z_A(t)$. Then, using Eqs. (17) and (18) along with $n(t) = N(t) = t$,
we find

$$ Q(t+1) = Q(t) + (2-\lambda) Q(t)/t + 2\Omega(t)/t, \qquad (19) $$

where we have used the following relations:

$$ Z_A(t) = k_A(t) w(t|A) = m_A(t)[m(t|A) - 1]. \qquad (20) $$

We can solve Eq. (19) in the continuum approximation and with the initial condition
$Q(1) = 0$ to find

$$ Q(t) = 2(2-\lambda) t (t^{1-\lambda} - 1)/(1-\lambda)^2 - 2t\ln t/(1-\lambda). \qquad (21) $$
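As a consistency check, the closed form of Eq. (21) can be substituted numerically into the continuum version of Eq. (19), $dQ/dt = (2-\lambda)Q/t + 2\Omega/t$ with $\Omega(t) = t(1 + \ln t)$. The sketch below uses a central finite difference for the derivative; the function names and the step size `h` are choices made here for illustration.

```python
import math


def omega(t):
    """Average number of edges in the bipartite network, Omega(t) = t(1 + ln t)."""
    return t * (1.0 + math.log(t))


def Q_closed(t, lam):
    """Closed form of Eq. (21); only defined for lam != 1."""
    return (2.0 * (2.0 - lam) * t * (t ** (1.0 - lam) - 1.0) / (1.0 - lam) ** 2
            - 2.0 * t * math.log(t) / (1.0 - lam))


def ode_residual(t, lam, h=1e-4):
    """dQ/dt - [(2 - lam) Q/t + 2 Omega/t], with dQ/dt from a central difference."""
    dQ = (Q_closed(t + h, lam) - Q_closed(t - h, lam)) / (2.0 * h)
    rhs = (2.0 - lam) * Q_closed(t, lam) / t + 2.0 * omega(t) / t
    return dQ - rhs
```

A residual that vanishes up to discretization error, together with `Q_closed(1, lam) == 0`, indicates that the closed form solves the continuum equation with the stated initial condition. The formula is singular at $\lambda = 1$, where the limit has to be taken separately.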




FIG. 4: Distribution of the weighted degree of the nodes of type II in the one mode picture. The
parameters are $t = 1000$, $\lambda = 1$ (squares), $\lambda = 0.5$ (circles) and $\lambda = 0$ (triangles). The guideline
shows a power law behavior of exponent 2.



Now from Eq. (18) we can write

$$ Z_A(t_A) = \Omega(t_A-1)/(t_A-1) + (1-\lambda) Q(t_A-1)/(t_A-1). \qquad (22) $$

Solving Eq. (17) in the continuum approximation we obtain

$$ Z_A(t) = C_A t - m_A(t)/(1-\lambda), \qquad (23) $$

where $C_A$ is again a constant determined by Eq. (22). For $\lambda = 1$ it is easy to find that

$$ Z_A(t) = [\ln t + 1 + \ln((t_A-1)/t_A)]\, t/t_A. \qquad (24) $$



As can be seen, for $t \to \infty$ we expect a power law distribution for the weighted degrees with
exponent 2, i.e. $P(Z) \sim Z^{-2}$. At the other extreme, that is for $\lambda = 0$, we have

$$ Z_A(t) = [4(t_A-1) - 1 - \ln((t_A-1)/t_A)]\, t/t_A - 2 - \ln t_A. \qquad (25) $$

Thus for large times $Z_A(t)$ is nearly independent of $t_A$ and we find a delta like distribution
for this quantity. The reader can check these statements in Fig. 4.

Finally let us study the behavior of the weight of the edges in the one mode picture. The
average weight of the edge between two nodes $A$ and $B$ with $t_A < t_B < t$ increases by one
only when the duplicated node is of type I and moreover is connected to both nodes.
This probability is given by $\lambda w_{AB}(t)/n(t)$, thus we find

$$ w_{AB}(t+1) = w_{AB}(t) + \lambda w_{AB}(t)/n(t). \qquad (26) $$

Moreover, when node $B$ enters the network at time $t_B$, the average of its connection weight
with a previously present node $A$ is given by

$$ w_{AB}(t_B) = \lambda m_A(t_B-1)/n(t_B-1) + (1-\lambda) \Big[ m_A(t_B-1) + \sum_{C \neq A} w_{AC}(t_B-1) \Big] / N(t_B-1). \qquad (27) $$

Using the fact that $n(t) = N(t) = t$ and $Z_A(t) = \sum_{C \neq A} w_{AC}(t)$ we find

$$ w_{AB}(t_B) = [m_A(t_B-1) + (1-\lambda) Z_A(t_B-1)]/(t_B-1). \qquad (28) $$

Thus, taking advantage of the continuum approximation to solve Eq. (26), we find that the
average weight of the edge between nodes $A$ and $B$ with $t_A < t_B < t$ is

$$ w_{AB}(t) = w_{AB}(t_B)(t/t_B)^{\lambda}. \qquad (29) $$

For $\lambda = 1$ we have

$$ w_{AB}(t) = t/(t_A t_B), \qquad (30) $$

where $t_A$ and $t_B$ can take integer values from 1 to $t$. Let us define

$$ G(x) := \sum_{t_A=1}^{t-1} \sum_{t_B=t_A+1}^{t} \delta_{x,\, t_A t_B}, \qquad (31) $$

which is the number of edges in the network with $x = t_A t_B$. It is easy to see graphically
that $G(x) \sim x - \sqrt{x}$ for $1 < x < t$ and $G(x) \sim t - \sqrt{x}$ for $t < x < t^2$. Now we can use
conservation of probabilities,

$$ \Delta G(x) = (1 - 1/(2\sqrt{x}))\Delta x = -E(w)\Delta w, \qquad (32) $$

to find the behavior of $E(w)$, the distribution of the weight of the edges, for large values of $w$:

$$ E(w) \sim (t/w^2)(1 - \sqrt{w/(4t)}). \qquad (33) $$
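The counting function $G(x)$ can also be tabulated by brute force, which gives a direct way to inspect the estimates used above for a given $t$. The sketch below is only an illustration (the function names are chosen here); it enumerates all pairs $t_A < t_B \le t$ and, for $\lambda = 1$, converts the products into average edge weights $w = t/(t_A t_B)$.

```python
from collections import Counter


def product_counts(t):
    """G(x): number of pairs t_A < t_B <= t with t_A * t_B = x."""
    G = Counter()
    for t_A in range(1, t):
        for t_B in range(t_A + 1, t + 1):
            G[t_A * t_B] += 1
    return G


def edge_weight_histogram(t):
    """Map each average edge weight w = t / x (lam = 1) to its multiplicity."""
    return {t / x: count for x, count in product_counts(t).items()}
```

Since every pair of type II nodes contributes exactly once, the multiplicities add up to $t(t-1)/2$. Binning the keys of `edge_weight_histogram(t)` on a logarithmic scale is one way to compare the tail of the resulting distribution with the guideline of exponent 2.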



Obviously for large $t$ the exponent of this distribution is 2. On the other hand, for $\lambda = 0$
we see from Eq. (29) that $w_{AB}(t) = w_{AB}(t_B)$. As before we expect an exponential tail for
the weight distribution of the edges in this case. These arguments are confirmed by
the numerical simulations in Fig. 5.

FIG. 5: Weight distribution of the edges in the one mode picture of the nodes of type II. The
parameters are $t = 1000$, $\lambda = 1$ (squares), $\lambda = 0.5$ (circles) and $\lambda = 0$ (triangles). The guideline
shows a power law behavior of exponent 2.

IV. ROLE OF LIMITED LIFETIME 

In this section we investigate the effect of the limited age of the nodes on the
behavior of the distributions studied in the previous section. To this end we assign a lifetime
to each type of node. That is, a node will be active only during its life, which is $t^*$ or
$T^*$ according to the type of the node. This is the only feature that we add to the model
studied above. In this way only the active nodes of each type have the opportunity to
be selected for duplication. Moreover, the new node can only establish connections with the
active nodes of the other type. Evolving the network in this manner, the number of active
nodes of each type during the evolution is always less than or equal to the assigned
lifetime. Nevertheless the total number of nodes of each type is, as before, equal to $t$. To
see the role of the limited ages we consider the case of $\lambda = 1$ with i) $t^* = \infty$ and ii) $t^* = T^*$.
Since the qualitative behavior of the interesting distributions is the same, we shall only focus
on $E(w)$, the weight distribution of the edges in the one mode picture of the nodes of type
II. Again, as the initial condition we start with a node of type I connected
to a node of type II. In Figs. 6 and 7 we show the above distribution for some values of $T^*$.
As Fig. 6 shows, by decreasing $T^*$ the general behavior of the distribution does not change and
even its exponent remains nearly constant. Of course the number of edges with weight zero
increases, as required by the conservation of probability. However, in Fig. 7 we see that
by decreasing $T^*$ the power law behavior of $E(w)$ slowly converts to an exponential decay.
FIG. 6: Weight distribution of the edges in the one mode picture of the nodes of type II when
$t^* = \infty$. The parameters are $t = 1000$, $\lambda = 1$, $T^* = 500$ (squares), $T^* = 200$ (circles) and
$T^* = 50$ (triangles). The guideline shows a power law behavior of exponent 2.

FIG. 7: Weight distribution of the edges in the one mode picture of the nodes of type II when
$t^* = T^*$. The parameters are $t = 1000$, $\lambda = 1$, $T^* = 500$ (squares), $T^* = 200$ (circles) and
$T^* = 50$ (triangles). The guideline shows a power law behavior of exponent 2.
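The model with limited lifetimes can be sketched along the same lines as the basic model. The code below is one possible reading of the rules of this section, not the implementation behind Figs. 6 and 7: it assumes that a node born at time $t_0$ counts as active at time $t$ while $t - t_0$ does not exceed its lifetime, that `life_I` and `life_II` play the roles of $t^*$ and $T^*$, and that connections to inactive nodes are simply not inherited by the copy. The function name and arguments are chosen here for illustration.

```python
import math
import random


def evolve_with_lifetimes(t_max, lam, life_I, life_II, seed=None):
    """Duplication model in which only active nodes take part in the evolution.

    Returns (links_I, links_II) as lists of neighbour sets. A lifetime of
    math.inf switches the age limit off for that type.
    """
    rng = random.Random(seed)
    links = ([{0}], [{0}])   # links[0]: type I nodes, links[1]: type II nodes
    birth = ([1], [1])       # birth times; both initial nodes are born at t = 1
    life = (life_I, life_II)

    def is_active(kind, node, now):
        # Assumed convention: active while the age does not exceed the lifetime.
        return now - birth[kind][node] <= life[kind]

    for now in range(2, t_max + 1):
        kind = 0 if rng.random() < lam else 1
        other = 1 - kind
        # Only active nodes of the chosen type can be duplicated.
        alive = [v for v in range(len(links[kind])) if is_active(kind, v, now)]
        parent = rng.choice(alive)
        copy = len(links[kind])
        # The copy inherits only the connections to active nodes of the other type.
        inherited = {u for u in links[kind][parent] if is_active(other, u, now)}
        links[kind].append(inherited)
        birth[kind].append(now)
        for u in inherited:
            links[other][u].add(copy)
        # The copy creates one node of the other type and connects to it.
        fresh = len(links[other])
        links[other].append({copy})
        birth[other].append(now)
        links[kind][copy].add(fresh)
    return links[0], links[1]
```

With this convention at most `life_I` (or `life_II`) nodes of a type are active at any time, while the total number of nodes of each type is still $t$. Case i) of the text corresponds to `life_I = math.inf` with a finite `life_II`, and case ii) to `life_I == life_II`. Lifetimes of at least one step guarantee that the node born in the previous step is always available for duplication.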

V. CONCLUSION 



In summary, we have shown how weighted scale free networks can be generated by
the evolution of bipartite networks whose evolution is based on a well known rule of
biology, that is duplication of the nodes. We showed that by tuning $\lambda$, which controls the
rate of duplication of the nodes of different types, one can go from a power law regime to an
exponential one where the tails of the distributions fall off exponentially. In this model the
exponents of interesting distributions are less than or equal to 2 and this is close to what
is seen in real weighted networks. We also studied the effect of a limited age for the nodes
and showed that a short lifetime may destroy the power law behavior of the distributions.
We emphasize that the simple model studied in this paper is a toy model and far from the
evolution of real networks. Nevertheless its success in generating scale free distributions
for the important quantities of weighted networks points to the essential role of the
duplication mechanism in the evolution of complex weighted networks. Certainly one can
enrich the above model, e.g. by introducing mutation, to get a more realistic
evolution.

Acknowledgments 

The author is grateful to V. Karimipour for careful reading of the manuscript and useful 
suggestions. 



[1] R. Albert and A.-L. Barabasi, Rev. Mod. Phys. 74, 47-97 (2002).
[2] S. N. Dorogovtsev and J. F. F. Mendes, Evolution of Networks: From Biological Nets to the Internet and WWW (Oxford University Press, 2003).
[3] M. E. J. Newman, SIAM Review 45, 167-256 (2003).
[4] D. J. Watts and S. H. Strogatz, Nature 393, 440 (1998).
[5] A.-L. Barabasi and R. Albert, Science 286, 509 (1999).
[6] L. A. N. Amaral, A. Scala, M. Barthelemy and H. E. Stanley, Proc. Natl. Acad. Sci. USA 97, 11149 (2000).
[7] R. Albert, H. Jeong and A.-L. Barabasi, Nature 406, 378 (2000).
[8] M. E. J. Newman, Phys. Rev. Lett. 89, 208701 (2002).
[9] H. Jeong, Z. Neda and A.-L. Barabasi, cond-mat/0104131.
[10] A. Barrat, M. Barthelemy, R. Pastor-Satorras and A. Vespignani, Proc. Natl. Acad. Sci. USA 101, 3747 (2004).
[11] A. Barrat, M. Barthelemy and A. Vespignani, Phys. Rev. Lett. 92, 228701 (2004).
[12] A.-C. Gavin et al., Nature 415, 141-147 (2002).
[13] A. Mashaghi, A. Ramezanpour and V. Karimipour, cond-mat/0304207.
[14] M. E. J. Newman, Proc. Natl. Acad. Sci. USA 98, 404-409 (2001).
[15] A.-L. Barabasi, H. Jeong, Z. Neda, E. Ravasz, A. Schubert and T. Vicsek, Physica A 311 (3-4), 590-614 (2002).
[16] J. J. Ramasco, S. N. Dorogovtsev and R. Pastor-Satorras, cond-mat/0403438.
[17] M. E. J. Newman, S. H. Strogatz and D. J. Watts, Phys. Rev. E 64, 026118 (2001).
[18] A. Wagner, Mol. Biol. Evol. 18(7), 1283-1292 (2001).
[19] R. V. Sole, R. Pastor-Satorras, E. Smith and T. B. Kepler, Advances in Complex Systems 5, 43 (2002).